Unsupervised Learning of Disambiguation Rules for Part of Speech Tagging
نویسنده
چکیده
In this paper we describe an unsupervised learning algorithm for automatically training a rule-based part of speech tagger without using a manually tagged corpus. We compare this algorithm to the Baum-Welch algorithm, used for unsupervised training of stochastic taggers. Next, we show a method for combining unsupervised and supervised rule-based training algorithms to create a highly accurate tagger using only a small amount of manually tagged text.
منابع مشابه
معرفی رویکردی ماشینی با استفاده از الگوریتم لسک و برچسبدهی نحوی جهت رفع ابهام از معنای کلمات
The present study introduces a machine-based approach for word sense disambiguation (WSD). In Persian, a morphologically complex language, POS tag which lots of homographs are made, one way for doing WSD is allocating the right Part Of Speech (POS) tags to words prior to WSD. Since the frequency of noun and adjective homographs in different Persian POS tag text corpuses is high, POS tag disambi...
متن کاملA Simple Unsupervised Learner for POS Disambiguation Rules Given Only a Minimal Lexicon
We propose a new model for unsupervised POS tagging based on linguistic distinctions between open and closed-class items. Exploiting notions from current linguistic theory, the system uses far less information than previous systems, far simpler computational methods, and far sparser descriptions in learning contexts. By applying simple language acquisition techniques based on counting, the syst...
متن کاملUnsupervised Part-Of-Speech Tagging Supporting Supervised Methods
This paper investigates the utility of an unsupervised partof-speech (PoS) system in a task oriented way. We use PoS labels as features for different supervised NLP tasks: Word Sense Disambiguation, Named Entity Recognition and Chunking. Further we explore, how much supervised tagging can gain from unsupervised tagging. A comparative evaluation between variants of systems using standard PoS, un...
متن کاملLearning part of speech disambiguation rules using Inductive Logic Programming
A pilot study on inducing rules for part of speech tagging of unrestricted Swedish text is reported. Using the Progol machine-learning system, Constraint Grammar inspired rules were learnt from the part of speech tagged Stockholm-Ume a Corpus. Several thousand disambiguation rules discarding faulty readings of ambiguously tagged words were induced. When tested on unseen data, 97% of the words r...
متن کاملCombining Hand-crafted Rules and Unsupervised Learning in Constraint-based Morphological Disambiguation
This paper presents a constraint-based morphological disambiguation approach that is applicable languages with complex morphology-specifically agglutinative languages with productive inflectional and derivational morphological phenomena. In certain respects, our approach has been motivated by Brill's recent work (Brill, 1995b), but with the observation that his transformational approach is not ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1995